Aside

View this CV online with links at duncangates.me/cv/

Contact

Language Skills

R
Python
SQL
Git
C#
Shiny
TensorFlow

Disclaimer

Made with the R package pagedown.

The source code is available on github.com/itsthesnake/cv.

Last updated on 2020-09-10.

Main

Duncan Gates

As a soon-to-be graduate of Oregon State University with focuses in a variety of fields, I have found that my interests coalesce around translating data into insight through the application of statistics, coding, and a deep understanding of individual issues. Since my internship this summer with Data Science for the Public Good1, I have sharpened my focus on team-based projects that will have significant positive impacts on the lives of others, drawing on the skills I have developed over the course of college.

Currently searching for a position that allows me to build tools leveraging a combination of visualization, machine learning, and software engineering to help people explore and understand their data in new and useful ways.

Education

B.S. in Economics w/ minors in Actuarial Science and Political Science

Corvallis, OR

Oregon State University

2020 - 2017

  • Thesis: The Relationship Between Income Taxation, Social Benefits, and Wealth Inequality in OECD Countries: Exploring the Factors Contributing to Unequal Household Wealth
  • Honors College Student
  • GPA: 3.82

Research Experience

Data Science Intern

Data Science for the Public Good

Oregon State University

2020 - 2020

  • Internship was split into training and research. The first half brought interns up to speed on statistics, machine learning, visualization, and team-based coding. In the second half, projects were assigned to teams of two undergraduates, one graduate student, and one mentor (generally a professor).
  • Measured the impact of installing a Selective Water Withdrawal Tower at the Pelton Dam on the Deschutes River. The investigation was based primarily on modelling river temperatures but also left room for learning: the team built an R Shiny application, presented at scientific symposiums, and worked collaboratively with officials from PGE, ODFW, ODEQ, and cities on the Deschutes.

Undergraduate Research Assistant

OSU Economics Research

Oregon State University

2020 - 2019

  • Conducting economic research under the guidance of numerous professors
  • Field research to gather panel and time series data

Undergraduate Researcher

Rubenstein Ecosystems Science Laboratory

University of Vermont

2015 - 2013

  • Analyzed and visualized data for CATOS fish tracking project.
  • Head of data mining project to establish temporal trends in population densities of Mysis diluviana (Mysis).
  • Ran project to mathematically model the migration patterns of Mysis (honors thesis project).

Undergraduate Researcher

LabInTheWild (Reineke Lab)

University of Michigan

2015 - 2015

  • Led development and implementation of interactive data visualizations to help users compare themselves to other demographics.

Undergraduate Researcher

Bentil Laboratory

University of Vermont

2014 - 2013

  • Developed mathematical model to predict the transport of sulfur through the environment with applications in waste cleanup.

Research Assistant

Adair Laboratory

University of Vermont

2013 - 2012

  • Independently analyzed and constructed statistical models for large data sets pertaining to carbon decomposition rates.

Industry Experience

I have studied and worked in a variety of capacities: my studies range from political science to computer science, and my work from lifeguarding and kayaking to software coding. I enjoy collaborative environments where I can learn from my peers.

Rental Location Manager

Portland, Oregon

Alder Creek Kayak & Canoe

2019 - 2015

  • Spent summers working as a rental location manager for a kayak company.

Engineering Intern - User Experience

Dealer.com

Burlington, VT

2015 - 2015

  • Built internal tool to help analyze and visualize user interaction with back-end products.

Data Science Intern

Dealer.com

Burlington, VT

2015 - 2015

  • Worked with the product analytics team to help parse and visualize large stores of data to drive business decisions.

Data Artist In Residence

Conduce

Carpinteria, CA

2015 - 2014

  • Envisioned, prototyped and implemented visualization framework in the course of one month.
  • Constructed training protocol for bringing third parties up to speed with new protocol.

Software Engineering Intern

Conduce

Carpinteria, CA

2014 - 2014

  • Incorporated d3.js into the company’s main software platform.




Teaching Experience

I am passionate about education. I believe that no topic is too complex if the teacher is empathetic and willing to think about new methods of approaching a task.

Javascript for Shiny Users

RStudio::conf 2020

N/A

2020

  • Served as TA for two-day workshop on how to leverage JavaScript in Shiny applications.
  • Lectured on using R2D3 package to build interactive visualizations.2

Data Visualization Best Practices

DataCamp

N/A

2019 - 2019

  • Designed course from the bottom up to teach best practices for scientific visualizations.
  • Uses R and ggplot2.
  • In top 10% on platform by popularity.

Improving your visualization in Python

DataCamp

N/A

2019 - 2019

  • Designed course from the bottom up to teach advanced methods for enhancing visualization.
  • Uses Python, matplotlib, and seaborn.

Advanced Statistical Learning and Inference

Vanderbilt Biostatistics Department

Nashville, TN

2018 - 2017

  • TA and lectured
  • Topics covered from penalized regression to boosted trees and neural networks
  • Highest level course offered in department

Advanced Statistical Computing

Vanderbilt Biostatistics Department

Nashville, TN

2018 - 2018

  • TA and lectured
  • Covered modern statistical computing algorithms
  • 4th year PhD level class

Statistical Computing in R

Vanderbilt Biostatistics Department

Nashville, TN

2017 - 2017

  • TA and lectured
  • Covered introduction to R language for statistics applications
  • Graduate level class

Selected Data Science Writing

I write about data science and visualization methodology on my site.3

Using AWK and R to Parse 25tb4

LiveFreeOrDichotomize.com

N/A

2019

  • Story of parsing large amounts of genomics data.
  • Provided advice for dealing with data much larger than disk.
  • Reached top of HackerNews.

Classifying physical activity from smartphone data5

RStudio Tensorflow Blog

N/A

2018

  • Walkthrough of training a convolutional neural network to achieve state-of-the-art recognition of activities from accelerometer data.
  • Contracted article.

The United States of Seasons6

LiveFreeOrDichotomize.com

N/A

2018

  • GIS analysis of weather data to find the most ‘seasonal’ locations in the United States.
  • Used Bayesian regression methods for smoothing sparse geospatial data.

A year as told by fitbit7

LiveFreeOrDichotomize.com

N/A

2017

  • Analyzed a full year’s worth of second-level heart rate data from a wearable device.
  • Demonstrated visualization-based inference for large data.

MCMC and the case of the spilled seeds8

LiveFreeOrDichotomize.com

N/A

2017

  • Full Bayesian MCMC sampler running in your browser.
  • Coded from scratch in vanilla Javascript.

The Traveling Metallurgist9

LiveFreeOrDichotomize.com

N/A

2017

  • Pure JavaScript implementation of a traveling salesman solution using simulated annealing.
  • Allows reader to customize the number and location of cities to attempt to trick the algorithm.

Selected Press (About)

Great paper? Swipe right on the new ‘Tinder for preprints’ app10

Science

N/A

2017 - 2017

  • Story of the app Papr11 made with Jeff Leek and Lucy D’Agostino McGowan.

Swipe right for science: Papr app is ‘Tinder for preprints’12

Nature News

N/A

2017 - 2017

  • Second press article for app Papr.

The Deeper Story in the Data13

University of Vermont Quarterly

N/A

2016 - 2016

  • Story on my path post-graduation and the power of narrative.



Selected Press (By)

Wildfires are Getting Worse, The New York Times14

The New York Times

N/A

2016 - 2016

  • GIS analysis and modeling of fire patterns and trends
  • Data in collaboration with NASA and USGS

Who’s Speaking at the Democratic National Convention?15

The New York Times

N/A

2016 - 2016

  • Data scraped from CSPAN records to figure out who talked at past conventions.

Who’s Speaking at the Republican National Convention?16

The New York Times

N/A

2016 - 2016

  • Used same data scraping techniques as Who’s Speaking at the Democratic National Convention?

A Trail of Terror in Nice, Block by Block17

The New York Times

N/A

2016 - 2016

  • Led research effort to put together story of 2016 terrorist attack in Nice, France in less than 12 hours.
  • Work won silver medal at Malofiej 2017 and gold at the Society for News Design.

The Great Student Migration18

The New York Times

N/A

2016 - 2016

  • Most shared and discussed article from the New York Times for August 2016.

Selected Publications, Posters, and Talks

Building a software package in tandem with machine learning methods research can result in both more rigorous code and more rigorous research

ENAR 2020

N/A

2020

  • Invited talk in Human Data Interaction section.
  • How and why building an R package can benefit methodological research

PheWAS-ME: A web-app for interactive exploration of multimorbidity patterns in PheWAS19

medRxiv

N/A

2020

  • Manuscript detailing application for the exploration of multimorbidity patterns in PheWAS analyses
  • See landing page20 for more information.

Stochastic Block Modeling in R, Statistically rigorous clustering with rigorous code21

RStudio::conf 2020

N/A

2020

  • Invited talk about new sbmR package22.
  • Focus on how software development and methodological research can both benefit when done in tandem.

Taking a network view of EHR and Biobank data to find explainable multivariate patterns23

Vanderbilt Biostatistics Seminar Series

N/A

2019 - 2019

  • University-wide seminar series.

Patient-specific risk factors independently influence survival in Myelodysplastic Syndromes in an unbiased review of EHR records

Under review (copy available upon request)

N/A

2019

  • Bayesian network analysis used to find novel subgroups of patients with Myelodysplastic Syndromes (MDS).
  • Analysis done using method built for my dissertation.

Patient specific comorbidities impact overall survival in myelofibrosis

Under review (copy available upon request)

N/A

2019

  • Bayesian network analysis used to find robust novel subgroups of patients with given genetic mutations.
  • Analysis done using method built for my dissertation.

Charge Reductions Associated with Shortening Time to Recovery in Septic Shock24

Chest

N/A

2019 - 2019

  • Authored with Wesley H. Self, MD MPH; Dandan Liu, PhD; Stephan Russ, MD, MPH; Michael J. Ward, MD, PhD, MBA; Nathan I. Shapiro, MD, MPH; Todd W. Rice, MD, MSc; Matthew W. Semler, MD, MSc.

Multimorbidity Explorer | A shiny app for exploring EHR and biobank data25

RStudio::conf 2019

N/A

2019 - 2019

  • Contributed Poster. Authored with Yaomin Xu.

R timelineViz: Visualizing the distribution of study events in longitudinal studies

Under review (copy available upon request)

N/A

2018 - 2018

  • Authored with Alex Sunderman of the Vanderbilt Department of Epidemiology.

Continuous Classification using Deep Neural Networks26

Vanderbilt Biostatistics Qualification Exam

N/A

2017 - 2017

  • Review of methods for classifying continuous data streams using neural networks
  • Successfully met qualifying examination standards

An Agent Based Model of Mysis Migration27

International Association of Great Lakes Research Conference

N/A

2015 - 2015

  • Authored with Brian O’Malley, Sture Hansson, and Jason Stockwell.

Declines of Mysis diluviana in the Great Lakes

Journal of Great Lakes Research

N/A

2015 - 2015

  • Authored with Peter Euclide and Jason Stockwell.

Asymmetric Linkage Disequilibrium: Tools for Dissecting Multiallelic LD

Journal of Human Immunology

N/A

2015 - 2015

  • Authored with Richard Single, Vanja Paunic, Mark Albrecht, and Martin Maiers.